Abstract

We introduce semi-supervised data classification algorithms based on total variation (TV), Reproducing
Kernel Hilbert Space (RKHS), support vector machine (SVM), and Cheeger cut, using both labeled and
unlabeled data points. We design binary and multi-class semi-supervised classification algorithms. We
compare the TV-based classification algorithms with the related Laplacian-based algorithms, and show that
TV classification performs significantly better when the number of labeled data points is small.

1 Introduction
1.1 Notation

Let $\{x_i, y_i\}_{1 \le i \le N}$ denote $N$ data points, where $x_i \in \mathbb{R}^d$ is the attribute vector of dimension $d$, and $y_i \in \{+1, -1\}$ (binary classification) or $y_i \in \{1, \dots, c\}$ (multi-class classification). The total number of data points is $N$, including $n$ labeled and $N - n$ unlabeled points. $H_K$ is a Reproducing Kernel Hilbert Space (RKHS), where $K : \mathbb{R}^{d \times d} \to \mathrm{Sym}(\mathbb{R})$ is an operator-valued, positive definite kernel. Finally, we use the abbreviation $f_i = f(x_i)$.


2 Binary (two-class) data classification

2.1 Regularized Least Square (RLS)

The standard RLS problem for binary classification is as follows [8]. Find a function $f : \mathbb{R}^d \to \mathbb{R}$ that solves

$$\min_{f \in H_K} \; \frac{\eta}{2} \sum_{i \in n} (y_i - f_i)^2 + \frac{\lambda}{2} \|f\|_{H_K}^2 \tag{1}$$

where $\eta, \lambda > 0$. The representer theorem states the existence of a minimizing function $f^*(x) = \sum_{j \in n} K(x, x_j) a_j^*$ (or $f = Ka$ in matrix representation), and the norm of $f$ in the RKHS is $\|f\|_{H_K}^2 = a^T K a$. Problem (1) is equivalent to

$$\min_{a \in \mathbb{R}^n} \; \frac{\eta}{2} \|y - Ka\|_2^2 + \frac{\lambda}{2} a^T K a \tag{2}$$

Taking the derivative w.r.t. $a$ provides the minimizer:

$$a^* = (\eta K + \lambda I_n)^{-1} (\eta y) \tag{3}$$

Finally, unseen data points are classified as follows:

$$x \in C_1 \ \text{ if } \ f^*(x) > 0 \tag{4}$$
$$x \in C_2 \ \text{ if } \ f^*(x) < 0 \tag{5}$$
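
As a minimal sketch of the closed form (3) and the sign rule (4)-(5), the NumPy snippet below fits RLS on a toy data set. The Gaussian kernel, the bandwidth `sigma`, the parameter values and all function names are assumptions made for illustration; the text does not prescribe a particular kernel.

```python
import numpy as np

def gaussian_kernel(X, Z, sigma=1.0):
    # Pairwise squared distances between rows of X and rows of Z (assumed kernel choice).
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def rls_fit(X, y, eta=1.0, lam=1e-2, sigma=1.0):
    # Solve (eta*K + lam*I) a = eta*y, which is equation (3).
    K = gaussian_kernel(X, X, sigma)
    return np.linalg.solve(eta * K + lam * np.eye(len(y)), eta * y)

def rls_predict(X_train, a, X_new, sigma=1.0):
    # f*(x) = sum_j K(x, x_j) a_j, followed by the sign rule (4)-(5).
    # The case f*(x) = 0 is not covered by (4)-(5); here it falls into class -1.
    f = gaussian_kernel(X_new, X_train, sigma) @ a
    return np.where(f > 0, 1, -1)

# Toy data: two well separated clusters, one per class.
X = np.array([[0.0, 0.0], [0.2, 0.1], [3.0, 3.0], [3.1, 2.9]])
y = np.array([1.0, 1.0, -1.0, -1.0])
a = rls_fit(X, y)
labels = rls_predict(X, a, np.array([[0.1, 0.0], [2.9, 3.1]]))
```

Using `np.linalg.solve` rather than forming the inverse explicitly is a numerical convenience; it computes the same vector as (3).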

2.2 Laplacian-based RLS

The Laplacian-based RLS problem for binary semi-supervised classification is as follows [1]:

$$\min_{f \in H_K} \; \frac{\eta}{2} \sum_{i \in n} (y_i - f_i)^2 + \frac{\lambda}{2} \|f\|_{H_K}^2 + \frac{\gamma}{2} \underbrace{\sum_{i,j \in N} w_{ij} |f_i - f_j|^2}_{\|Df\|^2} \tag{6}$$

where $\|Df\|^2 = \sum_{i,j \in N} w_{ij} |f_i - f_j|^2 \sim f^T L f$ is the Dirichlet energy and $L = D - W$ is the graph Laplacian. Observe that the training data points are composed of $n$ labeled points and $N - n$ unlabeled points. Let us consider the matrix $J = \mathrm{diag}(1, \dots, 1, 0, \dots, 0)$, with the first $n$ diagonal entries equal to 1 and the rest equal to 0, and the vector $y = [y_1, \dots, y_n, 0, \dots, 0]$, with the last $N - n$ entries equal to 0. This allows us to write $\sum_{i \in n} (y_i - f_i)^2 = \|y - Jf\|_2^2$. The representer theorem states the existence of a minimizing function $f^*(x) = \sum_{j \in N} K(x, x_j) a_j^*$. Problem (6) is equivalent to

$$\min_{a \in \mathbb{R}^N} \; \frac{\eta}{2} \|y - JKa\|_2^2 + \frac{\lambda}{2} a^T K a + \frac{\gamma}{2} (Ka)^T L (Ka) \tag{7}$$

Taking the derivative w.r.t. $a$ provides the minimizer:

$$a^* = (\eta J K + \lambda I_N + \gamma L K)^{-1} (\eta y) \tag{8}$$

Finally, unseen data points are classified as follows:

$$x \in C_1 \ \text{ if } \ f^*(x) > 0 \tag{9}$$
$$x \in C_2 \ \text{ if } \ f^*(x) < 0 \tag{10}$$
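
The closed form (8) can be sketched in a few lines of NumPy. The example below is only an illustration: it assumes a Gaussian affinity that is reused both as the kernel matrix $K$ and as the graph weights $W$, and it orders the points so that the $n$ labeled ones come first, as in the definition of $J$. The function names and parameter values are hypothetical.

```python
import numpy as np

def gaussian_affinity(X, sigma=0.5):
    # Assumed similarity; used here both as kernel K and as graph weights W.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def lap_rls_fit(K, W, y_labeled, eta=1.0, lam=1e-2, gamma=1.0):
    N, n = K.shape[0], len(y_labeled)
    J = np.diag(np.r_[np.ones(n), np.zeros(N - n)])   # selects the labeled points
    y = np.r_[y_labeled, np.zeros(N - n)]             # labels padded with zeros
    L = np.diag(W.sum(axis=1)) - W                    # graph Laplacian L = D - W
    A = eta * J @ K + lam * np.eye(N) + gamma * L @ K
    return np.linalg.solve(A, eta * y)                # equation (8)

# Two labeled points (first two rows) and two unlabeled points (last two rows).
X = np.array([[0.0], [3.0], [0.2], [2.8]])
K = gaussian_affinity(X)
a = lap_rls_fit(K, K, np.array([1.0, -1.0]))
f = K @ a   # values of f at all N points; the sign gives the predicted class
```

In this toy setting each unlabeled point sits next to one labeled point, so the sign of `f` on the unlabeled points follows the label of its neighbor.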

2.3 Total Variation-based RLS

The TV-based RLS problem for binary semi-supervised classification is as follows [6]:

$$\min_{f \in H_K} \; \frac{\eta}{2} \sum_{i \in n} (y_i - f_i)^2 + \frac{\lambda}{2} \|f\|_{H_K}^2 + \gamma \underbrace{\sum_{i,j \in N} w_{ij} |f_i - f_j|}_{\|Df\|} \tag{11}$$

where $\|Df\| = \sum_{i,j \in N} w_{ij} |f_i - f_j|$ is the graph TV of the function $f$. Unlike the previous optimization problems, minimizing (11) requires more advanced optimization techniques because the TV term is non-differentiable. However, recent advances in $\ell^1$ optimization provide efficient tools to deal with problem (11). In this work, we propose a splitting step coupled with an augmented Lagrangian method. Although one splitting variable is enough for minimizing (11), experimental observations suggest more accurate results using two splitting variables $g, h$. The proposed iterative optimization algorithm is as follows:

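$$\begin{aligned}
(f^{n+1}, h^{n+1}, g^{n+1}) = \arg\min_{f \in H_K,\, h,\, g} \; & \frac{\eta}{2} \|y - Jh\|_2^2 + \frac{\lambda}{2} \|f\|_{H_K}^2 + \gamma \|Dg\| \\
& + \langle \lambda_1^n, f - g \rangle + \frac{r_1}{2} \|f - g\|_2^2 + \langle \lambda_2^n, h - g \rangle + \frac{r_2}{2} \|h - g\|_2^2
\end{aligned} \tag{12}$$

$$\lambda_1^{n+1} = \lambda_1^n + r_1 (f^{n+1} - g^{n+1}) \tag{13}$$

$$\lambda_2^{n+1} = \lambda_2^n + r_2 (h^{n+1} - g^{n+1}) \tag{14}$$

The sub-minimization problem w.r.t. $f$ is:

$$\min_{f \in H_K} \; \frac{\lambda}{2} \|f\|_{H_K}^2 + \frac{r_1}{2} \left\| f - \left( g - \frac{\lambda_1}{r_1} \right) \right\|_2^2 \tag{15}$$

whose solution is given by $f^{n+1} = K a^{n+1}$, with

$$a^{n+1} = (\lambda I_N + r_1 K)^{-1} (r_1 g^n - \lambda_1^n) \tag{16}$$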
The sub-minimization problem w.r.t. $h$ is:

$$\min_{h} \; \frac{\eta}{2} \|y - Jh\|_2^2 + \frac{r_2}{2} \left\| h - \left( g - \frac{\lambda_2}{r_2} \right) \right\|_2^2 \tag{17}$$

whose solution is given by

$$h^{n+1} = (\eta J + r_2 I_N)^{-1} (\eta y + r_2 g^n - \lambda_2^n) \tag{18}$$
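
Because $\eta J + r_2 I_N$ is diagonal, the update (18) reduces to an elementwise division. The sketch below illustrates this under the assumption that the labeled points occupy the first `n_labeled` entries; the function name and the numerical values are hypothetical.

```python
import numpy as np

def update_h(y, g, lam2, n_labeled, eta=1.0, r2=1.0):
    # Diagonal of eta*J + r2*I: eta + r2 on labeled entries, r2 on unlabeled ones.
    diag = np.full(len(g), float(r2))
    diag[:n_labeled] += eta
    # Elementwise form of h = (eta*J + r2*I)^{-1} (eta*y + r2*g - lam2), equation (18).
    return (eta * y + r2 * g - lam2) / diag

y = np.array([1.0, -1.0, 0.0, 0.0])      # labels padded with zeros on unlabeled points
g = np.array([0.5, 0.5, 0.2, -0.4])
lam2 = np.zeros(4)
h = update_h(y, g, lam2, n_labeled=2, eta=3.0, r2=1.0)
```

On unlabeled points the update simply returns $g - \lambda_2 / r_2$, while on labeled points it is a weighted average pulled towards the label.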
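The sub-minimization problem w.r.t. $g$ is:

$$\min_{g} \; \gamma \|Dg\| + \frac{r_1}{2} \left\| g - \left( f + \frac{\lambda_1}{r_1} \right) \right\|_2^2 + \frac{r_2}{2} \left\| g - \left( h + \frac{\lambda_2}{r_2} \right) \right\|_2^2 \tag{19}$$

which can be written as

$$\min_{g} \; \gamma \|Dg\| + \frac{r_1 + r_2}{2} \left\| g - \frac{r_1 z_1 + r_2 z_2}{r_1 + r_2} \right\|_2^2 \tag{20}$$

with $z_1 = f + \frac{\lambda_1}{r_1}$ and $z_2 = h + \frac{\lambda_2}{r_2}$. Different techniques can be applied to solve the TV ROF problem [9]. We use the primal-dual method [3], which is guaranteed to converge in $O(1/k^2)$, $k$ being the iteration number. Finally, we project each function $f, h, g$ on the unit ball (i.e. $f^{n+1} \leftarrow N \cdot \frac{f^{n+1}}{\|f^{n+1}\|_2}$) and constrain them to be zero-mean (i.e. $f^{n+1} \leftarrow f^{n+1} - \mathrm{mean}(f^{n+1})$).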
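
To make the $g$-step concrete, the following sketch solves a problem of the form (20), $\min_g \gamma \|Dg\| + \frac{\rho}{2} \|g - z\|_2^2$ with $\rho = r_1 + r_2$ and $z = \frac{r_1 z_1 + r_2 z_2}{r_1 + r_2}$, using a basic (non-accelerated) primal-dual iteration. It is an illustrative stand-in, not necessarily the exact scheme of [3]: the dense difference operator, the step-size choice and the function names are assumptions.

```python
import numpy as np

def graph_difference_operator(W):
    # One row per ordered pair (i, j) with w_ij > 0,
    # so that ||D g||_1 = sum_{i,j} w_ij |g_i - g_j|.
    N = W.shape[0]
    rows = []
    for i in range(N):
        for j in range(N):
            if i != j and W[i, j] > 0:
                row = np.zeros(N)
                row[i], row[j] = W[i, j], -W[i, j]
                rows.append(row)
    return np.array(rows)

def graph_rof(W, z, gamma, rho, n_iter=500):
    # Minimise gamma*||D g||_1 + (rho/2)*||g - z||_2^2.
    D = graph_difference_operator(W)
    step = 0.99 / np.linalg.norm(D, 2)     # tau = sigma, so tau*sigma*||D||^2 < 1
    g, g_bar, p = z.copy(), z.copy(), np.zeros(D.shape[0])
    for _ in range(n_iter):
        # Dual ascent step followed by projection onto the box [-gamma, gamma].
        p = np.clip(p + step * (D @ g_bar), -gamma, gamma)
        # Primal proximal step for the quadratic data term.
        g_new = (g - step * (D.T @ p) + step * rho * z) / (1.0 + step * rho)
        g_bar = 2.0 * g_new - g            # extrapolation
        g = g_new
    return g

# Two nodes joined by one edge: the TV term shrinks the gap between the two values.
W = np.array([[0.0, 1.0], [1.0, 0.0]])
g = graph_rof(W, np.array([1.0, -1.0]), gamma=0.125, rho=1.0)
```

For this two-node example the minimizer can be found by hand: with $g = (t, -t)$ the objective is $4\gamma |t| + (t - 1)^2$, which gives $t = 1 - 2\gamma = 0.75$.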



We summarize the iterative algorithm:

$$a^{n+1} = (\lambda I_N + r_1 K)^{-1} (r_1 g^n - \lambda_1^n) \tag{21}$$

$$f^{n+1} = K a^{n+1} \tag{22}$$

$$h^{n+1} = (\eta J + r_2 I_N)^{-1} (\eta y + r_2 g^n - \lambda_2^n) \tag{23}$$

$$g^{n+1} = \arg\min_{g} \; \gamma \|Dg\| + \frac{r_1 + r_2}{2} \left\| g - \frac{r_1 z_1 + r_2 z_2}{r_1 + r_2} \right\|_2^2 \tag{24}$$

$$\text{with } z_1 = f + \frac{\lambda_1^n}{r_1}, \quad z_2 = h + \frac{\lambda_2^n}{r_2} \tag{25}$$

$$g^{n+1} = N \cdot \frac{g^{n+1}}{\|g^{n+1}\|_2} \tag{26}$$

$$g^{n+1} = g^{n+1} - \mathrm{mean}(g^{n+1}) \tag{27}$$

2.4 Cheeger-based RLS

The Cheeger-based RLS problem for binary semi-supervised classification is as follows:

$$\min_{f \in H_K} \; \frac{\sum_{i,j \in N} w_{ij} |f_i - f_j|}{\sum_{i \in N} |f_i - \mathrm{median}(f)|} \quad \text{s.t.} \quad f_i = y_i, \ \forall i \in n \tag{28}$$
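Based on [2], the following algorithm is proposed:

$$g^{n+1} = f^n + c \cdot \mathrm{sign}(f^n) \tag{29}$$

$$e^{n+1} = \mathrm{RLS}(g^{n+1}) \tag{30}$$

$$h^{n+1} = \arg\min_{h} \; TV(h) + \frac{E^n}{2c} \|h - e^{n+1}\|_2^2 \tag{31}$$

$$t^{n+1} = h^{n+1} - \mathrm{median}(h^{n+1}) \tag{32}$$

$$s^{n+1}(i) = \begin{cases} y_i & \forall i \in n \\ t^{n+1}(i) & \forall i \notin n \end{cases} \tag{33}$$

$$f^{n+1} = N \cdot \frac{s^{n+1}}{\|s^{n+1}\|_2} \tag{34}$$

where $\mathrm{RLS}(g)$ is as follows:

$$\min_{e \in H_K} \; \frac{\lambda}{2} \|e\|_{H_K}^2 + \frac{r}{2} \|e - g\|_2^2 \tag{35}$$

whose solution is given by $e^{n+1} = K a^*$, with

$$a^* = (\lambda I + rK)^{-1} r g \tag{36}$$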
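
As a small illustration of the quantities in this section, the sketch below evaluates the ratio minimized in (28) and applies the median shift, label reset and rescaling of steps (32)-(34). It assumes the labeled points are stored first and that $W$ is a dense symmetric weight matrix; the function names and the toy values are hypothetical.

```python
import numpy as np

def cheeger_ratio(W, f):
    # Numerator: graph TV, sum_{i,j} w_ij |f_i - f_j|.
    tv = (W * np.abs(f[:, None] - f[None, :])).sum()
    # Denominator: balance term, sum_i |f_i - median(f)|.
    balance = np.abs(f - np.median(f)).sum()
    return tv / balance

def reset_and_rescale(h, y_labeled):
    t = h - np.median(h)                     # step (32): subtract the median
    s = t.copy()
    s[:len(y_labeled)] = y_labeled           # step (33): enforce the known labels
    return len(s) * s / np.linalg.norm(s)    # step (34): rescale to norm N

# Two tight pairs of nodes connected by one weak edge (weight 0.1).
W = np.array([[0.0, 1.0, 0.0, 0.0],
              [1.0, 0.0, 0.1, 0.0],
              [0.0, 0.1, 0.0, 1.0],
              [0.0, 0.0, 1.0, 0.0]])
ratio = cheeger_ratio(W, np.array([1.0, 1.0, -1.0, -1.0]))
f_next = reset_and_rescale(np.array([0.5, 0.1, 0.3, -0.1]), np.array([1.0, -1.0]))
```

A function that is constant on each tight pair only pays for the weak edge, which is why the ratio is small for this choice of `f`.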

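2.5 Support Vector Machine (SVM)

The standard SVM method for binary classification is as follows [5]. Find a function $f : \mathbb{R}^d \to \mathbb{R}$ that solves

$$\min_{f \in H_K,\, b \in \mathbb{R}} \; \frac{\lambda}{2} \|f\|_{H_K}^2 \quad \text{s.t.} \quad y_i (f_i + b) \ge 1, \ i = 1, \dots, n \tag{37}$$

where $\lambda > 0$. To deal with the non-separable case, the above problem can be rewritten with a slack variable $\xi$:

$$\begin{aligned}
\min_{f \in H_K,\, b \in \mathbb{R},\, \xi \in \mathbb{R}^n} \; & \frac{\lambda}{2} \|f\|_{H_K}^2 + \mu \sum_{i=1}^{n} \xi_i \\
\text{s.t.} \; & y_i (f_i + b) \ge 1 - \xi_i, \ i = 1, \dots, n, \\
& \xi_i \ge 0, \ i = 1, \dots, n
\end{aligned} \tag{38}$$

The representer theorem states the existence of a minimizing function $f^*(x) = \sum_{i \in n} K(x, x_i) a_i^*$, and $\|f\|_{H_K}^2 = a^T K a$. Problem (38) is equivalent to

$$\begin{aligned}
\min_{a \in \mathbb{R}^n,\, b \in \mathbb{R},\, \xi \in \mathbb{R}^n} \; & \frac{\lambda}{2} a^T K a + \mu \sum_{i=1}^{n} \xi_i \\
\text{s.t.} \; & y_i \Big( \sum_{j=1}^{n} K(x_i, x_j) a_j + b \Big) \ge 1 - \xi_i, \ i = 1, \dots, n, \\
& \xi_i \ge 0, \ i = 1, \dots, n
\end{aligned} \tag{39}$$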

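By using the Lagrangian multiplier technique, problem (39) can be reformulated as:

$$\begin{aligned}
\min_{a, \xi, \beta, \beta_\xi \in \mathbb{R}^n,\, b \in \mathbb{R}} \; & \frac{\lambda}{2} a^T K a + \mu \xi^T \mathbf{1} + \beta^T (\mathbf{1} - \xi - Y(Ka + b\mathbf{1})) - \beta_\xi^T \xi \\
\text{s.t.} \; & \beta_i, \beta_{\xi_i} \ge 0, \ i = 1, \dots, n
\end{aligned} \tag{40}$$

where $\beta, \beta_\xi$ are Lagrangian multipliers, $\mathbf{1}$ is a vector whose elements are all ones, and $Y = \mathrm{diag}(y_1, \dots, y_n)$. Let us consider the Lagrangian optimality conditions. Taking the derivative w.r.t. $b$ and setting it to 0 gives

$$\beta^T Y \mathbf{1} = 0 \ \Rightarrow \ \beta^T y = 0. \tag{41}$$

Taking the derivative w.r.t. $\xi$ and setting it to 0 gives

$$\mu \mathbf{1} - \beta - \beta_\xi = 0 \ \Rightarrow \ 0 \le \beta_i \le \mu, \ i = 1, \dots, n. \tag{42}$$

Taking the derivative w.r.t. $a$ and setting it to 0 gives

$$\lambda K a - K Y \beta = 0 \ \Rightarrow \ a = \frac{Y \beta}{\lambda}. \tag{43}$$

By substituting (43) back into (40), we reach the following dual optimization problem:

$$\begin{aligned}
\max_{\beta \in \mathbb{R}^n} \; & \beta^T \mathbf{1} - \frac{1}{2} \beta^T Q \beta \\
\text{s.t.} \; & \beta^T y = 0, \\
& 0 \le \beta_i \le \mu, \ i = 1, \dots, n
\end{aligned} \tag{44}$$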
where $Q = Y \left( \frac{K}{\lambda} \right) Y$. The above problem can be solved using several efficient SVM solvers such as libSVM [4]. Once the optimal $\beta^*$ is obtained, it is straightforward to get the optimal $a^*$:

$$a^* = \frac{Y \beta^*}{\lambda} \tag{45}$$

and

$$f^*(x) = \sum_{i=1}^{n} a_i^* K(x, x_i). \tag{46}$$

Finally, unseen data points are classified as follows:

$$x \in C_1 \ \text{ if } \ f^*(x) > 0 \tag{47}$$
$$x \in C_2 \ \text{ if } \ f^*(x) < 0 \tag{48}$$
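
The relations between the dual variable $\beta$, the matrix $Q$ of (44) and the coefficients $a$ of (45) can be checked numerically. The sketch below does not solve the dual problem (a dedicated SVM solver would do that); it only evaluates the dual objective at a feasible $\beta$ chosen by hand and recovers $a$ from it. The kernel matrix, labels and function names are made-up illustrations.

```python
import numpy as np

def svm_dual_objective(K, y, beta, lam):
    # Q = Y (K / lam) Y with Y = diag(y); objective of (44).
    Y = np.diag(y)
    Q = Y @ (K / lam) @ Y
    return beta.sum() - 0.5 * beta @ Q @ beta

def recover_alpha(y, beta, lam):
    # a = Y beta / lam, equation (45); Y is diagonal so this is elementwise.
    return y * beta / lam

K = np.array([[1.0, 0.2], [0.2, 1.0]])   # toy kernel matrix (assumed)
y = np.array([1.0, -1.0])
beta = np.array([0.5, 0.5])              # feasible: beta^T y = 0 and beta >= 0
lam = 0.5

value = svm_dual_objective(K, y, beta, lam)
alpha = recover_alpha(y, beta, lam)
f_train = K @ alpha                      # f evaluated at the training points
```

The recovered `alpha` satisfies the stationarity condition $\lambda K a = K Y \beta$ that leads to (43).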

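2.6 Laplacian-based SVM

The Laplacian-based SVM problem with slack variable for binary semi-supervised classification is as follows [1]:

$$\begin{aligned}
\min_{f \in H_K,\, b \in \mathbb{R},\, \xi \in \mathbb{R}^N} \; & \frac{\lambda}{2} \|f\|_{H_K}^2 + \mu \sum_{i=1}^{N} \xi_i + \frac{\gamma}{2} \underbrace{\sum_{i,j \in N} w_{ij} |f_i - f_j|^2}_{\|Df\|^2} \\
\text{s.t.} \; & y_i (f_i + b) \ge 1 - \xi_i, \ i = 1, \dots, N, \\
& \xi_i \ge 0, \ i = 1, \dots, N
\end{aligned} \tag{49}$$

By using the Lagrangian multiplier technique, problem (49) becomes:

$$\begin{aligned}
\min \; & \frac{\lambda}{2} a^T K a + \mu \xi^T \mathbf{1} + \frac{\gamma}{2} a^T K L K a + \beta^T (\mathbf{1} - \xi - Y(Ka + b\mathbf{1})) - \beta_\xi^T \xi \\
\text{s.t.} \; & \beta_i, \beta_{\xi_i} \ge 0, \ i = 1, \dots, N
\end{aligned} \tag{50}$$

Applying the same steps as in (41), (42) and (43), we get

$$\begin{aligned}
\max_{\beta \in \mathbb{R}^N} \; & \beta^T \mathbf{1} - \frac{1}{2} \beta^T Q \beta \\
\text{s.t.} \; & \beta^T y = 0, \\
& 0 \le \beta_i \le \mu, \ i = 1, \dots, N
\end{aligned} \tag{51}$$

where

$$Q = Y (\lambda I + \gamma L K)^{-1} K Y \tag{52}$$

The optimal $a^*$ is obtained by solving the following linear system:

$$a^* = (\lambda I + \gamma L K)^{-1} Y \beta^* \tag{53}$$

and

$$f^*(x) = \sum_{i=1}^{N} a_i^* K(x, x_i). \tag{54}$$

Finally, unseen data points are classified as follows:

$$x \in C_1 \ \text{ if } \ f^*(x) > 0 \tag{55}$$
$$x \in C_2 \ \text{ if } \ f^*(x) < 0 \tag{56}$$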
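
2.7 Total Variation-based SVM

The TV-based SVM for binary semi-supervised classification is as follows:

$$\begin{aligned}
\min_{f \in H_K,\, b \in \mathbb{R},\, \xi \in \mathbb{R}^N} \; & \frac{\lambda}{2} \|f\|_{H_K}^2 + \mu \sum_{i=1}^{N} \xi_i + \gamma \underbrace{\sum_{i,j \in N} w_{ij} |f_i - f_j|}_{\|Df\|} \\
\text{s.t.} \; & y_i (f_i + b) \ge 1 - \xi_i, \ i = 1, \dots, N, \\
& \xi_i \ge 0, \ i = 1, \dots, N
\end{aligned} \tag{57}$$

where $\|Df\| = \sum_{i,j \in N} w_{ij} |f_i - f_j|$ is the graph TV of the function $f$. As for TV-based RLS, we use a splitting step coupled with an augmented Lagrangian method. The proposed iterative optimization algorithm is as follows:

$$\begin{aligned}
(f^{n+1}, h^{n+1}, g^{n+1}) = \arg\min \; & \frac{\lambda}{2} \|f\|_{H_K}^2 + \mu \sum_{i=1}^{N} \xi_i + \gamma \|Dg\| \\
& + \langle \lambda_1^n, f - g \rangle + \frac{r_1}{2} \|f - g\|_2^2 + \langle \lambda_2^n, h - g \rangle + \frac{r_2}{2} \|h - g\|_2^2
\end{aligned} \tag{58}$$

$$\text{s.t.} \quad y_i (h_i + b) \ge 1 - \xi_i, \ i = 1, \dots, N \tag{59}$$

$$\xi_i \ge 0, \ i = 1, \dots, N \tag{60}$$

$$\lambda_1^{n+1} = \lambda_1^n + r_1 (f^{n+1} - g^{n+1}) \tag{61}$$

$$\lambda_2^{n+1} = \lambda_2^n + r_2 (h^{n+1} - g^{n+1}) \tag{62}$$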

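The sub-minimization problem w.r.t. $f$ is:

$$\min_{f \in H_K} \; \frac{\lambda}{2} \|f\|_{H_K}^2 + \frac{r_1}{2} \left\| f - \left( g - \frac{\lambda_1}{r_1} \right) \right\|_2^2 \tag{63}$$

whose solution is given by $f^{n+1} = K a^{n+1}$, with

$$a^{n+1} = (\lambda I_N + r_1 K)^{-1} (r_1 g^n - \lambda_1^n) \tag{64}$$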
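The sub-minimization problem w.r.t. $h$ is:

$$\begin{aligned}
\min_{h,\, b,\, \xi} \; & \mu \sum_{i=1}^{N} \xi_i + \frac{r_2}{2} \|h - e\|_2^2 \\
\text{s.t.} \; & y_i (h_i + b) \ge 1 - \xi_i, \ i = 1, \dots, N, \\
& \xi_i \ge 0, \ i = 1, \dots, N
\end{aligned} \tag{65}$$

where $e = g - \frac{\lambda_2}{r_2}$. Problem (65) is equivalent to

$$\begin{aligned}
\min \; & \mu \xi^T \mathbf{1} + \frac{r_2}{2} \|h - e\|_2^2 + \beta^T (\mathbf{1} - \xi - Y(h + b\mathbf{1})) - \beta_\xi^T \xi \\
\text{s.t.} \; & \beta_i, \beta_{\xi_i} \ge 0, \ i = 1, \dots, N
\end{aligned} \tag{66}$$

Applying the same steps as in (41), (42) and (43), we get

$$\begin{aligned}
\max_{\beta \in \mathbb{R}^N} \; & \beta^T \mathbf{1} - \frac{1}{2} \beta^T Q \beta - \beta^T P \\
\text{s.t.} \; & \beta^T y = 0, \\
& 0 \le \beta_i \le \mu, \ i = 1, \dots, N
\end{aligned} \tag{67}$$

where $Q = \frac{I_N}{r_2}$ and $P = Ye$. Problem (67) can be solved by a gradient descent method, and the solution $\beta^*$ can be used to obtain the optimal $h$:

$$h^{n+1} = \frac{1}{r_2} Y \beta^* + e \tag{68}$$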
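The sub-minimization problem w.r.t. $g$ is:

$$\min_{g} \; \gamma \|Dg\| + \frac{r_1}{2} \left\| g - \left( f + \frac{\lambda_1}{r_1} \right) \right\|_2^2 + \frac{r_2}{2} \left\| g - \left( h + \frac{\lambda_2}{r_2} \right) \right\|_2^2$$

which can be written as

$$\min_{g} \; \gamma \|Dg\| + \frac{r_1 + r_2}{2} \left\| g - \frac{r_1 z_1 + r_2 z_2}{r_1 + r_2} \right\|_2^2$$

with $z_1 = f + \frac{\lambda_1}{r_1}$ and $z_2 = h + \frac{\lambda_2}{r_2}$.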
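We summarize the iterative algorithm:

$$a^{n+1} = (\lambda I_N + r_1 K)^{-1} (r_1 g^n - \lambda_1^n)$$

$$f^{n+1} = K a^{n+1}$$

$$\beta^* = \arg\max_{\beta \in \mathbb{R}^N} \; \beta^T \mathbf{1} - \frac{1}{2} \beta^T Q \beta - \beta^T P, \quad \text{s.t.} \ \beta^T y = 0, \ 0 \le \beta_i \le \mu, \ i = 1, \dots, N$$

$$h^{n+1} = \frac{1}{r_2} Y \beta^* + e$$

$$g^{n+1} = \arg\min_{g} \; \gamma \|Dg\| + \frac{r_1 + r_2}{2} \left\| g - \frac{r_1 z_1 + r_2 z_2}{r_1 + r_2} \right\|_2^2$$

$$\text{with } z_1 = f + \frac{\lambda_1^n}{r_1}, \quad z_2 = h + \frac{\lambda_2^n}{r_2}$$

$$g^{n+1} = N \cdot \frac{g^{n+1}}{\|g^{n+1}\|_2}$$

$$g^{n+1} = g^{n+1} - \mathrm{mean}(g^{n+1})$$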

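2.8 Cheeger-based SVM

The Cheeger-based SVM problem for binary semi-supervised classification is as follows:

$$\begin{aligned}
\min_{f \in H_K} \; & \frac{\sum_{i,j \in N} w_{ij} |f_i - f_j|}{\sum_{i \in N} |f_i - \mathrm{median}(f)|} \quad \text{s.t.} \quad f_i = y_i, \ \forall i \in n \\
\text{s.t.} \; & y_i (f_i + b) \ge 1 - \xi_i, \ i = 1, \dots, N, \\
& \xi_i \ge 0, \ i = 1, \dots, N
\end{aligned}$$

Based on [2], the following algorithm is proposed:

$$g^{n+1} = f^n + c \cdot \mathrm{sign}(f^n)$$

$$e^{n+1} = \mathrm{SVM}(g^{n+1})$$

$$h^{n+1} = \arg\min_{h} \; TV(h) + \frac{E^n}{2c} \|h - e^{n+1}\|_2^2$$

$$t^{n+1} = h^{n+1} - \mathrm{median}(h^{n+1})$$

$$s^{n+1}(i) = \begin{cases} y_i & \forall i \in n \\ t^{n+1}(i) & \forall i \notin n \end{cases}$$

$$f^{n+1} = N \cdot \frac{s^{n+1}}{\|s^{n+1}\|_2}$$

where $\mathrm{SVM}(g)$ is as follows:

$$\begin{aligned}
\min_{e \in H_K,\, b,\, \xi} \; & \frac{\lambda}{2} \|e\|_{H_K}^2 + \mu \sum_{i=1}^{N} \xi_i + \frac{r}{2} \|e - g\|_2^2 \\
\text{s.t.} \; & y_i (e_i + b) \ge 1 - \xi_i, \ i = 1, \dots, N, \\
& \xi_i \ge 0, \ i = 1, \dots, N
\end{aligned} \tag{88}$$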
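Problem (88) is equivalent to

$$\begin{aligned}
\min \; & \frac{\lambda}{2} \|e\|_{H_K}^2 + \mu \xi^T \mathbf{1} + \frac{r}{2} \|e - g\|_2^2 + \beta^T (\mathbf{1} - \xi - Y(e + b\mathbf{1})) - \beta_\xi^T \xi \\
\text{s.t.} \; & \beta_i, \beta_{\xi_i} \ge 0, \ i = 1, \dots, N
\end{aligned} \tag{89}$$

Applying the same steps as in (41), (42) and (43), we get

$$\begin{aligned}
\max_{\beta \in \mathbb{R}^N} \; & \beta^T \mathbf{1} - \frac{1}{2} \beta^T Q \beta - \frac{1}{2} P \beta \\
\text{s.t.} \; & \beta^T y = 0, \\
& 0 \le \beta_i \le \mu, \ i = 1, \dots, N
\end{aligned} \tag{90}$$

where $Q = YGY$, $P = r g^T (G + G^T) Y$ and $G = (\lambda I + rK)^{-1} K$. The above problem can be solved by a gradient descent method, and the solution $\beta^*$ can be used to obtain the optimal $a^*$:

$$a^* = (\lambda I + rK)^{-1} (Y \beta^* + r g) \tag{91}$$

and

$$e^{n+1} = K a^* \tag{92}$$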

2.9 Experimental results

# labels per class       1       5      10      50
Lap-RLS              18.09   10.48    7.77    4.14
Lap-SVM              13.79    9.84    7.61    4.77
TV-RLS                3.18    3.16    3.13    3.16
TV-SVM                3.18    3.13    3.13    3.08
Cheeger-RLS           4.06    3.74    4.03    2.84
Cheeger-SVM           3.87    3.74    4.00    2.73

Table 1: Binary semi-supervised classification algorithms tested on the sets of 4's and 9's from the USPS dataset.
Error is averaged over 10 runs with randomly selected labels.



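3 Multi-class data classification

3.1 Laplacian-based RLS

The Laplacian-based RLS problem for multi-class semi-supervised classification is as follows:

$$\begin{aligned}
\min_{f = (f^1, \dots, f^c) \in H_K} \; & \frac{\eta}{2} \sum_{k=1}^{c} \sum_{i \in n} (y_i^k - f_i^k)^2 + \frac{\lambda}{2} \sum_{k=1}^{c} \|f^k\|_{H_K}^2 + \frac{\gamma}{2} \sum_{k=1}^{c} \underbrace{\sum_{i,j \in N} w_{ij} |f_i^k - f_j^k|^2}_{\|Df^k\|^2} \\
\text{s.t.} \; & \sum_{k=1}^{c} f_i^k = 1, \ f_i^k \ge 0, \ \forall i \in N
\end{aligned} \tag{93}$$

where the last constraint is the simplex constraint. Problem (93) is equivalent to

$$\begin{aligned}
\min \; & \frac{\eta}{2} \sum_{k=1}^{c} \sum_{i \in n} (y_i^k - f_i^k)^2 + \frac{\lambda}{2} \sum_{k=1}^{c} \|f^k\|_{H_K}^2 + \frac{\gamma}{2} \sum_{k=1}^{c} \|Df^k\|^2 \\
\text{s.t.} \; & f^k = g^k, \ \sum_{k=1}^{c} g_i^k = 1, \ g_i^k \ge 0, \ \forall i \in N
\end{aligned} \tag{94}$$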
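This leads to the proposed iterative algorithm:

$$\begin{aligned}
(a^k)^{n+1} &= \arg\min_{a^k \in \mathbb{R}^N} \; \frac{\eta}{2} \|y^k - J^k K a^k\|_2^2 + \frac{\lambda}{2} (a^k)^T K a^k + \frac{\gamma}{2} (K a^k)^T L (K a^k) + \frac{r}{2} \left\| K a^k - \left( g^k - \frac{\lambda^k}{r} \right) \right\|_2^2 \\
&= (\eta J^k K + rK + \lambda I_N + \gamma L K)^{-1} (\eta y^k + r g^k - \lambda^k)
\end{aligned}$$

$$(f^k)^{n+1} = K (a^k)^{n+1}$$

$$(g^k)^{n+1} = \Pi_{\sum_k g^k = 1} \left( f^k + \frac{\lambda^k}{r} \right)$$

The simplex projection is done by Michelot's method [7].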
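
For reference, the simplex projection can be sketched as below. This is a plain NumPy rendering of the iterative scheme usually attributed to Michelot (shift onto the hyperplane, zero out negative entries, repeat on the remaining coordinates); the function name is hypothetical and the vector would be the $c$ class scores of a single data point.

```python
import numpy as np

def project_simplex_michelot(v):
    # Euclidean projection of v onto {x : sum(x) = 1, x >= 0}.
    v = np.asarray(v, dtype=float)
    active = np.ones(v.size, dtype=bool)       # coordinates still allowed to be positive
    while True:
        # Shift the active coordinates so that they sum to one.
        shift = (v[active].sum() - 1.0) / active.sum()
        x = np.where(active, v - shift, 0.0)
        negative = x < 0
        if not negative.any():
            return x
        # Coordinates that became negative are fixed to zero from now on.
        active &= ~negative

p1 = project_simplex_michelot([1.0, 0.9, -1.0])
p2 = project_simplex_michelot([0.2, 0.2, 0.2])
```

In the multi-class algorithms this projection would be applied independently to each data point $i$, across the class index $k$.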
Finally, unseen data points are classified as follows:

$$x \in C_k \ \text{ if } \ f^k(x) = \max_{j} \left( \{ f^j(x) \}_{1 \le j \le c} \right)$$

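3.2 Total Variation-based RLS

The TV-based RLS problem for multi-class semi-supervised classification is as follows:

$$\begin{aligned}
\min_{f = (f^1, \dots, f^c) \in H_K} \; & \frac{\eta}{2} \sum_{k=1}^{c} \sum_{i \in n} (y_i^k - f_i^k)^2 + \frac{\lambda}{2} \sum_{k=1}^{c} \|f^k\|_{H_K}^2 + \gamma \sum_{k=1}^{c} \sum_{i,j \in N} w_{ij} |f_i^k - f_j^k| \\
\text{s.t.} \; & \sum_{k=1}^{c} f^k(i) = 1, \ f^k(i) \ge 0, \ \forall i \in N
\end{aligned} \tag{100}$$

Problem (100) is equivalent to

$$\begin{aligned}
\min \; & \frac{\eta}{2} \sum_{k=1}^{c} \sum_{i \in n} (y_i^k - f_i^k)^2 + \frac{\lambda}{2} \sum_{k=1}^{c} \|f^k\|_{H_K}^2 + \gamma \sum_{k=1}^{c} \|Dg^k\| \\
\text{s.t.} \; & f^k = g^k, \ \sum_{k=1}^{c} g_i^k = 1, \ g_i^k \ge 0, \ \forall i \in N
\end{aligned}$$

This leads to the proposed iterative algorithm:

$$\begin{aligned}
(a^k)^{n+1} &= \arg\min_{a^k \in \mathbb{R}^N} \; \frac{\eta}{2} \|y^k - J^k K a^k\|_2^2 + \frac{\lambda}{2} (a^k)^T K a^k + \frac{r}{2} \left\| K a^k - \left( g^k - \frac{\lambda^k}{r} \right) \right\|_2^2 \\
&= (\eta J^k K + rK + \lambda I_N)^{-1} (\eta y^k + r g^k - \lambda^k)
\end{aligned}$$

$$(f^k)^{n+1} = K (a^k)^{n+1}$$

$$(g^k)^{n+1} = \arg\min_{g^k} \; \gamma \|Dg^k\| + \frac{r}{2} \left\| g^k - \left( f^k + \frac{\lambda^k}{r} \right) \right\|_2^2$$

$$(g^k)^{n+1} = \Pi_{\sum_k g^k = 1} (g^k)$$

$$(f^k)^{n+1} = N \cdot \frac{(g^k)^{n+1}}{\|(g^k)^{n+1}\|_2}$$

Finally, unseen data points are classified as follows:

$$x \in C_k \ \text{ if } \ f^k(x) = \max_{j} \left( \{ f^j(x) \}_{1 \le j \le c} \right)$$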



3.3 Cheeger-based RLS 

The Cheeger-based RLS problem for multi-class semi-supervised classification is as follows: 

(109) 

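The following algorithm is proposed:

$$(g^k)^{n+1} = (f^k)^n + c \cdot \mathrm{sign}((f^k)^n) \tag{110}$$

$$(e^k)^{n+1} = \mathrm{RLS}((g^k)^{n+1}) \tag{111}$$

$$(h^k)^{n+1} = \arg\min_{h^k} \; TV(h^k) + \frac{E^n}{2c} \|h^k - (e^k)^{n+1}\|_2^2 \tag{112}$$

$$(t^k)^{n+1} = (h^k)^{n+1} - \mathrm{median}((h^k)^{n+1}) \tag{113}$$

$$(s^k)^{n+1}(i) = \begin{cases} y_i^k & \forall i \in n \\ (t^k)^{n+1}(i) & \forall i \notin n \end{cases} \tag{114}$$

$$(s^k)^{n+1} = \Pi_{\sum_k s^k = 1} (s^k) \tag{115}$$

$$(f^k)^{n+1} = N \cdot \frac{(s^k)^{n+1}}{\|(s^k)^{n+1}\|_2} \tag{116}$$

where $\mathrm{RLS}(g)$ is exactly the same as (35).

Finally, unseen data points are classified as follows:

$$x \in C_k \ \text{ if } \ f^k(x) = \max_{j} \left( \{ f^j(x) \}_{1 \le j \le c} \right) \tag{117}$$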

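3.4 Laplacian-based SVM

The Laplacian-based SVM for multi-class semi-supervised classification is as follows:

$$\begin{aligned}
\min_{f = (f^1, \dots, f^c) \in H_K,\, b \in \mathbb{R}^c,\, \xi \in \mathbb{R}^{N \times c}} \; & \frac{\lambda}{2} \sum_{k=1}^{c} \|f^k\|_{H_K}^2 + \mu \sum_{k=1}^{c} \sum_{i=1}^{N} \xi_i^k + \frac{\gamma}{2} \sum_{k=1}^{c} \underbrace{\sum_{i,j \in N} w_{ij} |f_i^k - f_j^k|^2}_{\|Df^k\|^2} \\
\text{s.t.} \; & y_i^k (f_i^k + b^k) \ge 1 - \xi_i^k, \ \xi_i^k \ge 0, \ i \in N, \ k \in c, \\
& \sum_{k=1}^{c} f_i^k = 1, \ f_i^k \ge 0, \ \forall i \in N
\end{aligned}$$

The problem above is equivalent to

$$\begin{aligned}
\min_{f = (f^1, \dots, f^c) \in H_K,\, b \in \mathbb{R}^c,\, \xi \in \mathbb{R}^{N \times c}} \; & \frac{\lambda}{2} \sum_{k=1}^{c} \|f^k\|_{H_K}^2 + \mu \sum_{k=1}^{c} \sum_{i=1}^{N} \xi_i^k + \frac{\gamma}{2} \sum_{k=1}^{c} \|Df^k\|^2 \\
\text{s.t.} \; & y_i^k (f_i^k + b^k) \ge 1 - \xi_i^k, \ \xi_i^k \ge 0, \ i \in N, \ k \in c, \\
& f^k = g^k, \ \sum_{k=1}^{c} g_i^k = 1, \ g_i^k \ge 0, \ \forall i \in N
\end{aligned}$$

Note that each $f^k$ can be solved independently by using the procedure below (the superscript $k$ is omitted for convenience):

$$\begin{aligned}
\min_{f \in H_K,\, b \in \mathbb{R},\, \xi \in \mathbb{R}^N} \; & \frac{\lambda}{2} \|f\|_{H_K}^2 + \mu \xi^T \mathbf{1} + \frac{\gamma}{2} f^T L f + \frac{r}{2} \|f - e\|_2^2 \\
\text{s.t.} \; & y_i (f_i + b) \ge 1 - \xi_i, \ \xi_i \ge 0, \ i \in N
\end{aligned} \tag{118}$$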
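where $e = g - \frac{l}{r}$, and $l$ is the Lagrangian multiplier. Problem (118) is equivalent to

$$\begin{aligned}
\min_{b \in \mathbb{R},\, a, \xi, \beta, \beta_\xi \in \mathbb{R}^N} \; & \frac{\lambda}{2} a^T K a + \mu \xi^T \mathbf{1} + \frac{\gamma}{2} a^T K L K a + \frac{r}{2} \|Ka - e\|_2^2 \\
& + \beta^T (\mathbf{1} - \xi - Y(Ka + b\mathbf{1})) - \beta_\xi^T \xi \\
\text{s.t.} \; & \beta_i, \beta_{\xi_i} \ge 0, \ i \in N
\end{aligned} \tag{119}$$

Applying the same steps as in (41), (42) and (43), we get

$$\begin{aligned}
\max_{\beta \in \mathbb{R}^N} \; & \beta^T \mathbf{1} - \frac{1}{2} \beta^T Q \beta - \frac{1}{2} P \beta \\
\text{s.t.} \; & \beta^T y = 0, \\
& 0 \le \beta_i \le \mu, \ i = 1, \dots, N
\end{aligned} \tag{120}$$

where $Q = YGY$, $P = r e^T (G + G^T) Y$ and $G = (\lambda I + \gamma L K + rK)^{-1} K$. The above problem can be solved by a gradient descent method, and the solution $\beta^*$ can be used to obtain the optimal $a^*$, which is:

$$a^* = (\lambda I + \gamma L K + rK)^{-1} (Y \beta^* + r e) \tag{121}$$

and

$$f = K a^* \tag{122}$$

This leads to the following iterative algorithm:

$$(f^k)^{n+1} = \text{computed by using (120), (121) and (122)} \tag{123}$$

$$(g^k)^{n+1} = \Pi_{\sum_k g^k = 1} \left( (f^k)^{n+1} + \frac{l^k}{r} \right). \tag{124}$$

The simplex projection is done by Michelot's method [7].
Finally, unseen data points are classified as follows:

$$x \in C_k \ \text{ if } \ f^k(x) = \max_{j} \left( \{ f^j(x) \}_{1 \le j \le c} \right) \tag{125}$$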

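3.5 Total Variation-based SVM

The TV-based SVM for multi-class semi-supervised classification is as follows:

$$\begin{aligned}
\min \; & \frac{\lambda}{2} \sum_{k=1}^{c} \|f^k\|_{H_K}^2 + \mu \sum_{k=1}^{c} \sum_{i=1}^{N} \xi_i^k + \gamma \sum_{k=1}^{c} \underbrace{\sum_{i,j \in N} w_{ij} |f_i^k - f_j^k|}_{\|Df^k\|} \\
\text{s.t.} \; & y_i^k (f_i^k + b^k) \ge 1 - \xi_i^k, \ \xi_i^k \ge 0, \ i \in N, \ k \in c, \\
& \sum_{k=1}^{c} f_i^k = 1, \ f_i^k \ge 0, \ \forall i \in N
\end{aligned} \tag{126}$$

Problem (126) is equivalent to

$$\begin{aligned}
\min \; & \frac{\lambda}{2} \sum_{k=1}^{c} \|f^k\|_{H_K}^2 + \mu \sum_{k=1}^{c} \sum_{i=1}^{N} \xi_i^k + \gamma \sum_{k=1}^{c} \|Dg^k\| \\
\text{s.t.} \; & y_i^k (f_i^k + b^k) \ge 1 - \xi_i^k, \ \xi_i^k \ge 0, \ i \in N, \ k \in c, \\
& f^k = g^k, \ \sum_{k=1}^{c} g_i^k = 1, \ g_i^k \ge 0, \ \forall i \in N
\end{aligned} \tag{127}$$

Note that each $f^k$ can be solved independently:

$$\begin{aligned}
\min_{f \in H_K,\, b \in \mathbb{R},\, \xi \in \mathbb{R}^N} \; & \frac{\lambda}{2} \|f\|_{H_K}^2 + \mu \xi^T \mathbf{1} + \frac{r}{2} \|f - e\|_2^2 \\
\text{s.t.} \; & y_i (f_i + b) \ge 1 - \xi_i, \ \xi_i \ge 0, \ i \in N
\end{aligned} \tag{128}$$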
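where $e = g - \frac{l}{r}$, and $l$ is the Lagrangian multiplier. Problem (128) is equivalent to

$$\begin{aligned}
\min_{b \in \mathbb{R},\, a, \xi, \beta, \beta_\xi \in \mathbb{R}^N} \; & \frac{\lambda}{2} a^T K a + \mu \xi^T \mathbf{1} + \frac{r}{2} \|Ka - e\|_2^2 + \beta^T (\mathbf{1} - \xi - Y(Ka + b\mathbf{1})) - \beta_\xi^T \xi \\
\text{s.t.} \; & \beta_i, \beta_{\xi_i} \ge 0, \ i \in N
\end{aligned} \tag{129}$$

Applying the same steps as in (41), (42) and (43), we get

$$\begin{aligned}
\max_{\beta \in \mathbb{R}^N} \; & \beta^T \mathbf{1} - \frac{1}{2} \beta^T Q \beta - \frac{1}{2} P \beta \\
\text{s.t.} \; & \beta^T y = 0, \\
& 0 \le \beta_i \le \mu, \ i = 1, \dots, N
\end{aligned} \tag{130}$$

where $Q = YGY$, $P = r e^T (G + G^T) Y$ and $G = (\lambda I + rK)^{-1} K$. The above problem can be solved by a gradient descent method, and the solution $\beta^*$ can be used to obtain the optimal $a^*$, which is:

$$a^* = (\lambda I + rK)^{-1} (Y \beta^* + r e) \tag{131}$$

and

$$f = K a^* \tag{132}$$

This leads to the proposed iterative algorithm:

$$(f^k)^{n+1} = \text{computed by using (130), (131) and (132)} \tag{133}$$

$$(g^k)^{n+1} = \arg\min_{g^k} \; \gamma \|Dg^k\| + \frac{r}{2} \left\| g^k - \left( f^k + \frac{l^k}{r} \right) \right\|_2^2 \tag{134}$$

$$(g^k)^{n+1} = \Pi_{\sum_k g^k = 1} (g^k) \tag{135}$$

$$(f^k)^{n+1} = N \cdot \frac{(g^k)^{n+1}}{\|(g^k)^{n+1}\|_2} \tag{136}$$

Finally, unseen data points are classified as follows:

$$x \in C_k \ \text{ if } \ f^k(x) = \max_{j} \left( \{ f^j(x) \}_{1 \le j \le c} \right) \tag{137}$$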

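3.6 Cheeger-based SVM

The Cheeger-based SVM problem with slack variable for multi-class classification is as follows:

$$\min \; \sum_{k=1}^{c} \frac{\sum_{i,j \in N} w_{ij} |f_i^k - f_j^k|}{\sum_{i \in N} |f_i^k - \mathrm{median}(f^k)|} \quad \text{s.t.} \quad f_i^k = y_i^k, \ \forall i \in n \tag{138}$$

The following algorithm is proposed:

$$(g^k)^{n+1} = (f^k)^n + c \cdot \mathrm{sign}((f^k)^n) \tag{139}$$

$$(e^k)^{n+1} = \mathrm{SVM}((g^k)^{n+1}) \tag{140}$$

$$(h^k)^{n+1} = \arg\min_{h^k} \; TV(h^k) + \frac{E^n}{2c} \|h^k - (e^k)^{n+1}\|_2^2 \tag{141}$$

$$(t^k)^{n+1} = (h^k)^{n+1} - \mathrm{median}((h^k)^{n+1}) \tag{142}$$

$$(s^k)^{n+1}(i) = \begin{cases} y_i^k & \forall i \in n \\ (t^k)^{n+1}(i) & \forall i \notin n \end{cases} \tag{143}$$

$$(s^k)^{n+1} = \Pi_{\sum_k s^k = 1} (s^k) \tag{144}$$

$$(f^k)^{n+1} = N \cdot \frac{(s^k)^{n+1}}{\|(s^k)^{n+1}\|_2} \tag{145}$$

where $\mathrm{SVM}(\cdot)$ is as defined in Section 2.8.
Finally, unseen data points are classified as follows:

$$x \in C_k \ \text{ if } \ f^k(x) = \max_{j} \left( \{ f^j(x) \}_{1 \le j \le c} \right) \tag{146}$$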

3.7 Experimental results

# labels per class       1       5      10      50
Lap-RLS              20.06    6.64    4.03    3.3
Lap-SVM              49.95   14.21    6.27    2.82
TV-RLS                2.0     2.06    1.91    1.98
TV-SVM                1.75    1.82    1.77    1.85
Cheeger-RLS           3.35    1.95    1.85    1.87
Cheeger-SVM           2.94    2.08    1.72    1.74

Table 2: Multi-class semi-supervised classification algorithms tested on four classes (0's, 1's, 4's and 9's)
from the USPS dataset. Error is averaged over 10 runs with randomly selected labels.